[Router] Add Valkey memory backend with TLS support #1739
rootfs merged 13 commits into vllm-project:main from feature/valkey-memory-backend
Conversation
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
Extract TLS configuration into buildValkeyTLSConfig helper to satisfy cyclop (complexity 13 > 12) and nestif linters. Signed-off-by: Daria Korenieva <daric2612@gmail.com>
@Xunzhuo This PR is ready for review. Would appreciate any feedback when you get a chance. Thanks!
Pull request overview
Adds a Valkey (Search module) implementation of the router "memory" Store, selectable via `global.stores.memory.backend: valkey`, including TLS wiring, plus docs/config examples and tests to validate the new backend.
Changes:
- Introduce `ValkeyStore` implementing the memory `Store` interface (vector retrieval via `FT.SEARCH`, HNSW index init, hybrid reranking, access tracking).
- Add config surfaces for `memory.backend` + `memory.valkey.*` (Go runtime config + CLI model), and wire backend selection + TLS setup in the extproc router.
- Add documentation, example configs, and unit/integration tests for the Valkey backend.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| website/sidebars.ts | Adds a docs sidebar category entry for the Valkey memory deployment guide. |
| website/docs/tutorials/plugin/memory.md | Documents that the memory plugin requires a configured backing store and mentions Valkey backend. |
| website/docs/tutorials/global/stores-and-tools.md | Adds global configuration examples for Milvus vs Valkey memory backends. |
| website/docs/installation/valkey-memory.md | New deployment/config/tuning/troubleshooting guide for Valkey as memory backend. |
| src/vllm-sr/cli/models_memory.py | Adds CLI-side Pydantic model for Valkey memory config + backend selector field. |
| src/semantic-router/pkg/memory/valkey_store.go | Implements Valkey-backed memory Store (index init, CRUD, retrieve, list, delete-by-scope, etc.). |
| src/semantic-router/pkg/memory/valkey_store_helpers.go | Helper logic (result parsing, retry/backoff, escaping, access tracking, hybrid rerank). |
| src/semantic-router/pkg/memory/valkey_store_test.go | Unit tests for Valkey helper behavior + TLS config fields. |
| src/semantic-router/pkg/memory/valkey_store_integration_test.go | Integration tests against a live Valkey + Search module instance. |
| src/semantic-router/pkg/memory/valkey_store_integration_validation_test.go | Integration validation tests (bad inputs, disabled-store behavior, TLS config propagation). |
| src/semantic-router/pkg/memory/consolidation.go | Refactors ConsolidateUser to be backend-agnostic via the Store interface. |
| src/semantic-router/pkg/extproc/router_memory.go | Adds backend selection (milvus/valkey), Valkey client + TLS construction, and shared Redis cache wrapping. |
| src/semantic-router/pkg/config/runtime_config.go | Adds MemoryConfig.Backend and MemoryValkeyConfig to runtime YAML config surface. |
| src/semantic-router/pkg/config/reference_config_global_test.go | Extends reference-config coverage assertions to include global.stores.memory.valkey. |
| e2e/config/config.memory-user-valkey.yaml | Adds an E2E config profile using the Valkey memory backend. |
| deploy/examples/runtime/memory/valkey.yaml | Adds a Valkey memory backend example configuration (including TLS knobs). |
| deploy/examples/runtime/README.md | Updates runtime examples README to mention memory/vector-store config references. |
| config/config.yaml | Extends the repo’s canonical example config to include memory.backend and a memory.valkey block. |
```go
embeddingConfig := &memory.EmbeddingConfig{
    Model:     memory.EmbeddingModelType(detectMemoryEmbeddingModel(cfg)),
    Dimension: vc.Dimension,
}
```
Fixed. createValkeyMemoryStore now normalizes vc.Dimension before constructing the EmbeddingConfig or calling NewValkeyStore. If vc.Dimension <= 0, the dimension is derived from the resolved embedding model: 256 for mmbert, 384 for all others. This ensures the FT index dimension and the embedding dimension always agree, regardless of whether the user sets valkey.dimension explicitly.
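The fallback described above can be sketched as a small pure function. This is an illustrative reconstruction, not the PR's exact code: the function name `normalizeDimension` is hypothetical, while the 256/384 values and the `mmbert` model name come from the comment.

```go
package main

import "fmt"

// normalizeDimension sketches the dimension fallback: when the configured
// dimension is unset or invalid (<= 0), derive it from the resolved
// embedding model so the FT index dimension and the embedding dimension
// always agree. An explicit positive value always wins.
func normalizeDimension(configured int, model string) int {
    if configured > 0 {
        return configured
    }
    if model == "mmbert" {
        return 256
    }
    return 384
}

func main() {
    fmt.Println(normalizeDimension(0, "mmbert"))    // falls back to 256
    fmt.Println(normalizeDimension(0, "qwen3"))     // falls back to 384
    fmt.Println(normalizeDimension(1024, "mmbert")) // explicit value wins
}
```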
```go
key := v.hashKey(memory.ID)

err = v.retryWithBackoff(ctx, func() error {
    _, hsetErr := v.client.HSet(ctx, key, fields)
    return hsetErr
})
if err != nil {
    status = "error"
    return fmt.Errorf("valkey HSET failed for memory id=%s: %w", memory.ID, err)
}
```
Fixed. ValkeyStore.Store now does an EXISTS check before HSET. If the key already exists it returns "memory already exists: <id>", matching the Store interface contract and the behaviour of InMemoryStore/MilvusStore. TestValkeyStoreInteg_DuplicateKeys has been updated to expect an error on the second Store call and verify the original content is unchanged.
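The enforced-uniqueness contract can be illustrated with a minimal sketch. The real store talks to Valkey through valkey-glide; here a hypothetical `hashClient` interface and a map-backed fake stand in for it, and `storeOnce` is an invented name for the check-then-write shape.

```go
package main

import "fmt"

// hashClient is a hypothetical subset of the Valkey client used by the
// store; the real implementation goes through valkey-glide.
type hashClient interface {
    Exists(key string) (bool, error)
    HSet(key string, fields map[string]string) error
}

// storeOnce sketches the contract described above: probe with EXISTS and
// refuse to overwrite an existing memory, matching InMemoryStore/MilvusStore.
func storeOnce(c hashClient, id string, fields map[string]string) error {
    exists, err := c.Exists("memory:" + id)
    if err != nil {
        return err
    }
    if exists {
        return fmt.Errorf("memory already exists: %s", id)
    }
    return c.HSet("memory:"+id, fields)
}

// fakeClient is a map-backed stand-in for demonstration.
type fakeClient struct{ data map[string]map[string]string }

func (f *fakeClient) Exists(key string) (bool, error) { _, ok := f.data[key]; return ok, nil }
func (f *fakeClient) HSet(key string, fields map[string]string) error {
    f.data[key] = fields
    return nil
}

func main() {
    c := &fakeClient{data: map[string]map[string]string{}}
    fmt.Println(storeOnce(c, "m1", map[string]string{"content": "first"}) == nil) // true
    err := storeOnce(c, "m1", map[string]string{"content": "second"})
    fmt.Println(err)                            // memory already exists: m1
    fmt.Println(c.data["memory:m1"]["content"]) // first
}
```

Note that a separate EXISTS check is not atomic with the HSET; under concurrent writers of the same ID a set-if-absent primitive (or a small script) would close that window.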
```go
// Fetch limit+1 pages worth of data so we can sort client-side and still
// respect the limit. We over-fetch by a factor to allow client-side sorting
// by created_at (FT.SEARCH does not support ORDER BY on NUMERIC fields
// without SORTABLE in all Valkey Search versions).
// The total count comes from the FT.SEARCH header element.
fetchLimit := limit * 5
if fetchLimit < 100 {
    fetchLimit = 100
}
if fetchLimit > 10000 {
    fetchLimit = 10000
}

searchCmd := []string{
    "FT.SEARCH", v.indexName, filterExpr,
    "RETURN", "7", "id", "content", "user_id", "memory_type", "metadata", "created_at", "updated_at",
    "LIMIT", "0", strconv.Itoa(fetchLimit),
```
Fixed. List now uses SORTBY created_at DESC with LIMIT 0 <limit> directly in the FT.SEARCH command, since created_at is already declared SORTABLE in the index schema. The over-fetch multiplier (fetchLimit = limit * 5) and the client-side sort.Slice have been removed.
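The revised command shape can be sketched as follows. The `RETURN` fields are taken from the diff; the index name and filter in the example are placeholders, and the argument order follows the RediSearch-style `FT.SEARCH` syntax that Valkey Search implements.

```go
package main

import (
    "fmt"
    "strconv"
    "strings"
)

// buildListSearch sketches the revised List query: because created_at is
// declared SORTABLE in the index schema, sorting is pushed into FT.SEARCH
// via SORTBY ... DESC, and LIMIT uses the caller's limit directly, so the
// over-fetch multiplier and client-side sort.Slice are no longer needed.
func buildListSearch(indexName, filterExpr string, limit int) []string {
    return []string{
        "FT.SEARCH", indexName, filterExpr,
        "RETURN", "7", "id", "content", "user_id", "memory_type", "metadata", "created_at", "updated_at",
        "SORTBY", "created_at", "DESC",
        "LIMIT", "0", strconv.Itoa(limit),
    }
}

func main() {
    cmd := buildListSearch("memory_idx", "@user_id:{alice}", 20)
    fmt.Println(strings.Join(cmd, " "))
}
```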
```go
// Sync metadata JSON with the authoritative HASH fields.
// We read the current metadata, overwrite access_count and last_accessed
// with the values we just wrote atomically above, and write it back.
// This avoids the previous read-modify-write race: even if two goroutines
// run concurrently, each writes the post-increment count it received from
// HINCRBY, so the JSON converges to the correct value.
fields, err := v.client.HGetAll(ctx, key)
if err != nil {
    return nil // Non-critical: top-level HASH fields are already updated
}
if metadataStr, ok := fields["metadata"]; ok && metadataStr != "" {
    var metadata map[string]interface{}
    if jsonErr := json.Unmarshal([]byte(metadataStr), &metadata); jsonErr == nil {
        metadata["last_accessed"] = now.Unix()
        // Use the authoritative count returned by HINCRBY instead of
        // incrementing the stale JSON value.
        metadata["access_count"] = valkeyToInt64(newCount)
        if updated, mErr := json.Marshal(metadata); mErr == nil {
            _, _ = v.client.HSet(ctx, key, map[string]string{"metadata": string(updated)})
        }
```
Fixed by removing access_count from the metadata JSON entirely. The change is end-to-end: valkeyBuildHashFields no longer writes access_count into the metadata blob; valkeyApplyMetadata no longer reads it from there; valkeyFieldsToMemory reads access_count from the authoritative top-level HASH field (set atomically by HINCRBY); recordRetrieval only updates last_accessed in metadata and explicitly deletes any stale access_count key. This eliminates the race entirely without needing a Lua script.
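The metadata-side half of that change can be sketched as a pure JSON transform. The function name `refreshMetadata` is hypothetical; the behaviour (write only `last_accessed`, delete any stale `access_count`, leave the count to the top-level HASH field) follows the description above.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// refreshMetadata sketches the revised recordRetrieval behaviour: only
// last_accessed is written into the metadata JSON, and any stale
// access_count key left over from older records is deleted. The
// authoritative access_count lives in a top-level HASH field maintained
// atomically by HINCRBY, so no counter ever round-trips through JSON.
func refreshMetadata(metadataJSON string, nowUnix int64) (string, error) {
    var metadata map[string]interface{}
    if err := json.Unmarshal([]byte(metadataJSON), &metadata); err != nil {
        return "", err
    }
    metadata["last_accessed"] = nowUnix
    delete(metadata, "access_count") // drop the stale copy; the HASH field is canonical
    out, err := json.Marshal(metadata)
    return string(out), err
}

func main() {
    updated, err := refreshMetadata(`{"access_count":3,"topic":"go"}`, 1700000000)
    fmt.Println(err, updated)
}
```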
```go
result := store
if rc := cfg.Memory.RedisCache; rc != nil && rc.Enabled && rc.Address != "" {
    cacheCfg := &memory.RedisCacheConfig{
        Address:    rc.Address,
        Password:   rc.Password,
        DB:         rc.DB,
        KeyPrefix:  rc.KeyPrefix,
        TTLSeconds: rc.TTLSeconds,
    }
    redisCache, err := memory.NewRedisCache(ctx, cacheCfg)
    if err != nil {
        logging.Warnf("Memory: Redis cache disabled (connection failed: %v)", err)
    } else {
        result = memory.NewCachingStore(store, redisCache)
    }
}
```
Fixed. NewCachingStore now accepts a backendLabel string parameter. createMemoryStore passes the resolved backend name ("milvus" or "valkey") so cache hit/miss metrics are correctly labeled per backend. The CachingStore.Retrieve method uses this label when calling RecordMemoryCacheHit/RecordMemoryCacheMiss.
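The labeling change can be illustrated with a stripped-down sketch. The `metricsRecorder` counter here is an invented stand-in for the real `RecordMemoryCacheHit`/`RecordMemoryCacheMiss` helpers; the point shown is only that the wrapper carries the resolved backend name and uses it as the label.

```go
package main

import "fmt"

// metricsRecorder is a stand-in for the real metrics helpers; it just
// counts hits and misses per backend label.
type metricsRecorder struct{ hits, misses map[string]int }

// labeledCache sketches the change: the caching wrapper is constructed with
// the resolved backend name ("milvus" or "valkey") and applies that label
// on every cache hit or miss it records.
type labeledCache struct {
    backendLabel string
    rec          *metricsRecorder
}

func (c *labeledCache) recordHit()  { c.rec.hits[c.backendLabel]++ }
func (c *labeledCache) recordMiss() { c.rec.misses[c.backendLabel]++ }

func main() {
    rec := &metricsRecorder{hits: map[string]int{}, misses: map[string]int{}}
    valkeyCache := &labeledCache{backendLabel: "valkey", rec: rec}
    valkeyCache.recordHit()
    valkeyCache.recordMiss()
    valkeyCache.recordHit()
    fmt.Println(rec.hits["valkey"], rec.misses["valkey"]) // 2 1
}
```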
```go
func TestValkeyStoreInteg_DuplicateKeys(t *testing.T) {
    store, _ := setupValkeyMemoryIntegration(t)
    ctx := context.Background()

    id := fmt.Sprintf("mem_dup_%d", time.Now().UnixNano())

    // Store first version
    require.NoError(t, store.Store(ctx, &Memory{
        ID: id, Type: MemoryTypeSemantic,
        Content: "First version", UserID: "dup_user",
    }))
    time.Sleep(200 * time.Millisecond)

    // Store again with same ID (HSET overwrites)
    require.NoError(t, store.Store(ctx, &Memory{
        ID: id, Type: MemoryTypeSemantic,
        Content: "Second version", UserID: "dup_user",
    }))
    time.Sleep(200 * time.Millisecond)

    // Verify latest content
    retrieved, err := store.Get(ctx, id)
    require.NoError(t, err)
    assert.Equal(t, "Second version", retrieved.Content)
}
```
Updated. TestValkeyStoreInteg_DuplicateKeys now expects an error on the second Store call (assert.Contains(t, err.Error(), "memory already exists")) and verifies the original content is unchanged via Get. The test comment has been updated to reflect the enforced-uniqueness semantics.
…mantic-router into feature/valkey-memory-backend

Summary
- Add a Valkey memory backend, selectable via `backend: valkey` in config
- Implement the `Store` interface: Store, Retrieve (vector similarity via `FT.SEARCH`), Get, Update, List, Forget, ForgetByScope, with HNSW indexing, hybrid reranking, adaptive threshold, and access tracking
- Refactor `ConsolidateUser` from a `*MilvusStore` receiver to a standalone function accepting the `Store` interface, for backend-agnostic consolidation
- Add TLS support (`tls_enabled`, `tls_ca_path`, `tls_insecure_skip_verify`) using the valkey-glide native TLS API
- Fix `recordRetrieval` to keep metadata JSON `access_count` consistent under concurrent retrievals

Test plan
- `make test-semantic-router` — all packages pass (0 failures)
- `make go-lint` — 0 issues
- `make check-go-mod-tidy` — clean
- Ran the `config.memory-user-valkey.yaml` profile against Valkey + Search module
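The three TLS knobs from the summary map naturally onto Go's `crypto/tls`. This is a sketch of how such a helper could look, assuming standard-library primitives only; the PR's actual `buildValkeyTLSConfig` helper, which feeds valkey-glide, may differ in detail.

```go
package main

import (
    "crypto/tls"
    "crypto/x509"
    "fmt"
    "os"
)

// buildTLSConfig sketches wiring tls_enabled, tls_ca_path, and
// tls_insecure_skip_verify into a *tls.Config. A nil config means the
// connection stays plaintext.
func buildTLSConfig(enabled bool, caPath string, insecureSkipVerify bool) (*tls.Config, error) {
    if !enabled {
        return nil, nil
    }
    cfg := &tls.Config{
        MinVersion:         tls.VersionTLS12,
        InsecureSkipVerify: insecureSkipVerify, // for dev/test environments only
    }
    if caPath != "" {
        pem, err := os.ReadFile(caPath)
        if err != nil {
            return nil, fmt.Errorf("read CA file %s: %w", caPath, err)
        }
        pool := x509.NewCertPool()
        if !pool.AppendCertsFromPEM(pem) {
            return nil, fmt.Errorf("no valid certificates in %s", caPath)
        }
        cfg.RootCAs = pool // trust only the configured CA
    }
    return cfg, nil
}

func main() {
    cfg, err := buildTLSConfig(true, "", true)
    fmt.Println(err == nil, cfg != nil, cfg.InsecureSkipVerify)
}
```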